A Comparison of Approaches to Word Class Tagging: Disjunctively vs. Conjunctively Written Bantu Languages*
Abstract
Northern Sotho and Zulu are two South African Bantu languages that use different writing systems: a disjunctive and a conjunctive writing system, respectively. This article argues that the different orthographic systems obscure the morphological similarities between the two languages and that these systems directly affect word class tagging for each. It illustrates not only that different approaches are needed for word class tagging, but also that the sequencing of tasks is to a large extent determined by the difference in writing systems.
Similar resources
Setswana Tokenisation and Computational Verb Morphology: Facing the Challenge of a Disjunctive Orthography
Prefixes of the Setswana verb: The subject agreement morphemes, written disjunctively, include non-consecutive subject agreement morphemes and consecutive subject agreement morphemes. For example, the non-consecutive subject agreement morpheme for class 5 is le as in lekau le a tshega (the young man is laughing), while the consecutive subject agreement morpheme for class 5 is la as in lekau la...
User-friendly Dictionaries for Zulu: An Exercise in Complexicography
In this paper the main features of Bantu lexicography are analysed through several case studies of Zulu dictionary features. Examples from both existing dictionaries as well as a forthcoming reference work are used in the analysis, which develops from verbs and nouns, gradually including more word classes, and ending with a detailed study of possessive pronouns. The latter serves as one example...
Finite state tokenisation of an orthographical disjunctive agglutinative language: The verbal segment of Northern Sotho
Tokenisation is an important first pre-processing step required to adequately test finite-state morphological analysers. In agglutinative languages each morpheme is concatenatively added on to form a complete morphological structure. Disjunctive agglutinative languages like Northern Sotho write these morphemes, for certain morphological categories only, as separate words separated by spaces or ...
Kappa-Join: Efficient Execution of Existential Quantification in XML Query Languages
XML query languages feature powerful primitives for formulating queries involving comparison expressions which are existentially quantified. If such comparisons involve several scopes, they are correlated, and become difficult to evaluate efficiently. In this paper, we develop a new ternary operator, called Kappa-Join, for efficiently evaluating queries with existential quantification. In XML q...
The Consequences of the Contacts between Bantu and Non-Bantu Languages around Lake Eyasi in Northern Tanzania
In rural Tanzania, recent major influences happen between Kiswahili and English to ethnic languages rather than ethnic languages, which had been in contact for so long, influencing each other. In this work, I report the results of investigation of lexical changes in indigenous languages that aimed at examining how ethnic communities and their languages, namely Cushitic Iraqw, Nilotic Datooga, N...